Joint speech and audio coding combining sinusoidal modeling and wavelet packets
Abstract
This paper presents a joint speech and audio coding algorithm that combines sinusoidal modeling with a perceptually adapted Wavelet Packet Transform (WPT). The input signal is band-limited to 50-7000 Hz and sampled at 16 kHz. The sinusoidal modeling uses a Sinusoidal Similarity Measure (SSM) to find stable sinusoidal components, and a novel pitch-harmonics-based scheme encodes the sinusoidal frequencies. The residual is obtained by subtracting the re-synthesized sinusoids from the input and is processed by a WPT that simulates the critical bands of the human auditory system. Perceptual Noise Substitution (PNS) is applied in noisy WPT sub-bands to reduce the bit rate. The method provides nearly transparent quality for both speech and audio inputs. The mean bit rate of the compressed signal varies between 32 and 62 kbps, depending on the input. Demonstration sound files are available at www-sc.enstbretagne.fr/~fek/eurospeech01.
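The residual step described in the abstract (input minus re-synthesized sinusoids) can be illustrated with a minimal sketch. This is not the paper's method: it does not implement the SSM or the pitch-harmonics encoding, and it assumes the sinusoid frequencies are already known. The function names (`fit_sinusoid`, `residual_after_sinusoids`, `energy`), the 20 ms frame length, and the test frequencies are all hypothetical choices made for illustration; only the 16 kHz sampling rate comes from the abstract.

```python
import math

FS = 16000  # sampling rate stated in the abstract (16 kHz)


def fit_sinusoid(x, freq_hz, fs=FS):
    """Least-squares fit of x[k] ~ a*cos(w*k) + b*sin(w*k) at a known frequency."""
    n = len(x)
    w = 2.0 * math.pi * freq_hz / fs
    c = [math.cos(w * k) for k in range(n)]
    s = [math.sin(w * k) for k in range(n)]
    # Entries of the 2x2 normal equations.
    cc = sum(v * v for v in c)
    ss = sum(v * v for v in s)
    cs = sum(u * v for u, v in zip(c, s))
    xc = sum(u * v for u, v in zip(x, c))
    xs = sum(u * v for u, v in zip(x, s))
    det = cc * ss - cs * cs
    a = (xc * ss - xs * cs) / det
    b = (xs * cc - xc * cs) / det
    return a, b


def residual_after_sinusoids(x, freqs_hz, fs=FS):
    """Subtract one fitted sinusoid per (assumed known) frequency from the frame."""
    r = list(x)
    for f in freqs_hz:
        a, b = fit_sinusoid(r, f, fs)
        w = 2.0 * math.pi * f / fs
        r = [v - (a * math.cos(w * k) + b * math.sin(w * k))
             for k, v in enumerate(r)]
    return r


def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)


# Hypothetical 20 ms frame: two harmonics of 200 Hz plus a weak 3 kHz component
# standing in for whatever the sinusoidal model does not capture.
N = 320
frame = [
    0.8 * math.cos(2.0 * math.pi * 200.0 * k / FS + 0.3)
    + 0.4 * math.cos(2.0 * math.pi * 400.0 * k / FS + 1.0)
    + 0.01 * math.sin(2.0 * math.pi * 3000.0 * k / FS)
    for k in range(N)
]
residual = residual_after_sinusoids(frame, [200.0, 400.0])
print(energy(residual) / energy(frame))
```

In a coder of the kind the abstract describes, a residual like this would then be passed to the wavelet packet stage. The sketch stops at the subtraction and makes no claim about how the paper selects or encodes the sinusoids.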
Similar resources
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech from noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
Adaptive Signal Models: Theory, Algorithms, and Audio Applications
Adaptive Signal Models: Theory, Algorithms, and Audio Applications, by Michael Mark Goodwin. Doctor of Philosophy in Engineering (Electrical Engineering and Computer Science), University of California, Berkeley. Professor Edward A. Lee, Chair. Mathematical models of natural signals have long been of interest in the scientific community. A primary example is the Fourier model, which was introduced to ex...
Multiresolution sinusoidal modeling using adaptive segmentation
The sinusoidal model has proven useful for representation and modification of speech and audio. One drawback, however, is that a sinusoidal signal model is typically derived using a fixed frame size, which corresponds to a rigid signal segmentation. For nonstationary signals, the resolution limitations that result from this rigidity lead to reconstruction artifacts. It is shown in this paper that ...
Amplitude Modulated Sinusoidal Models for Audio Modeling and Coding
In this paper a new perspective on modeling of transient phenomena in the context of sinusoidal audio modeling and coding is presented. In our approach the task of finding time-varying amplitudes for sinusoidal models is viewed as an AM demodulation problem. A general perfect reconstruction framework for amplitude modulated sinusoids is introduced and model reductions lead to a model for audio co...
Transitional speech segments modeling by matching pursuit with a dictionary based on the psychoacoustic adaptive WP
This paper proposes modeling transitional speech segments by matching pursuit. The dictionary for matching pursuit is composed of wavelet functions that implement a psychoacoustically adaptive wavelet filter bank. Psychoacoustically motivated, entropy-based cost functions greatly reduce the number of time-frequency atoms in the wavelet packet (WP) dictionary. The given transient modeli...